10. 解决方案:爱上子查询

以下视频演示了解决第一个问题的步骤。每个问题答案都以文本的形式显示在下面。

子查询解决方案

解决方案:爱上子查询

  1. 提供每个 区域 拥有最高销售额 ( total_amt_usd ) 的 销售代表 姓名


    首先,我要算出与每个 销售代表 相关的总销售额 ( total_amt_usd ),并且要得出他们所在的区域。以下查询提供了这一信息。
SELECT s.name rep_name, r.name region_name, SUM(o.total_amt_usd) total_amt
FROM sales_reps s
JOIN accounts a
ON a.sales_rep_id = s.id
JOIN orders o
ON o.account_id = a.id
JOIN region r
ON r.id = s.region_id
GROUP BY 1,2
ORDER BY 3 DESC;

接着,得出每个区域的最高销售额,然后使用该信息从最终结果中获取这些行。

SELECT region_name, MAX(total_amt) total_amt
FROM(SELECT s.name rep_name, r.name region_name, SUM(o.total_amt_usd) total_amt
FROM sales_reps s
JOIN accounts a
ON a.sales_rep_id = s.id
JOIN orders o
ON o.account_id = a.id
JOIN region r
ON r.id = s.region_id
GROUP BY 1, 2) inner1
GROUP BY 1;

本质上,这是两个表格的 连接 ,其中区域和销售额相匹配。

SELECT t1.rep_name, t1.region_name, t1.total_amt
FROM(SELECT s.name rep_name, r.name region_name, SUM(o.total_amt_usd) total_amt
FROM sales_reps s
JOIN accounts a
ON a.sales_rep_id = s.id
JOIN orders o
ON o.account_id = a.id
JOIN region r
ON r.id = s.region_id
GROUP BY 1,2
ORDER BY 3 DESC) t1
JOIN (SELECT region_name, MAX(total_amt) total_amt
FROM(SELECT s.name rep_name, r.name region_name, SUM(o.total_amt_usd) total_amt
FROM sales_reps s
JOIN accounts a
ON a.sales_rep_id = s.id
JOIN orders o
ON o.account_id = a.id
JOIN region r
ON r.id = s.region_id
GROUP BY 1, 2) inner1
GROUP BY 1) t2
ON t1.region_name = t2.region_name AND t1.total_amt = t2.total_amt;
  1. 对于具有最高销售额 ( total_amt_usd ) 的区域,总共下了多少个订单?


    我写的第一个查询是获取每个 区域 total_amt_usd
SELECT r.name region_name, SUM(o.total_amt_usd) total_amt
FROM sales_reps s
JOIN accounts a
ON a.sales_rep_id = s.id
JOIN orders o
ON o.account_id = a.id
JOIN region r
ON r.id = s.region_id
GROUP BY r.name;

然后,我们仅从该表格中获取销售额最高的区域。可以通过两种方法来获取,一种是使用子查询后的最大值,另一种是按降序排序,然后获取最高值。

SELECT MAX(total_amt)
FROM (SELECT r.name region_name, SUM(o.total_amt_usd) total_amt
FROM sales_reps s
JOIN accounts a
ON a.sales_rep_id = s.id
JOIN orders o
ON o.account_id = a.id
JOIN region r
ON r.id = s.region_id
GROUP BY r.name) sub;

最终,我们要获取具有该区域销售额的总订单量:

SELECT r.name, SUM(o.total) total_orders
FROM sales_reps s
JOIN accounts a
ON a.sales_rep_id = s.id
JOIN orders o
ON o.account_id = a.id
JOIN region r
ON r.id = s.region_id
GROUP BY r.name
HAVING SUM(o.total_amt_usd) = (
SELECT MAX(total_amt)
FROM (SELECT r.name region_name, SUM(o.total_amt_usd) total_amt
FROM sales_reps s
JOIN accounts a
ON a.sales_rep_id = s.id
JOIN orders o
ON o.account_id = a.id
JOIN region r
ON r.id = s.region_id
GROUP BY r.name) sub);

结果就是 Northeast ,总订单为 1230378 个。

  1. 对于购买标准纸张数量 ( standard_qty ) 最多的客户(在作为客户的整个时期内), 有多少客户 的购买总数依然更多?


    首先,我们要得出购买标准纸张数量 ( standard_qty ) 最多的客户。以下查询获取了该客户,以及总消费:
SELECT a.name account_name, SUM(o.standard_qty) total_std, SUM(o.total) total
FROM accounts a
JOIN orders o
ON o.account_id = a.id
GROUP BY 1
ORDER BY 2 DESC
LIMIT 1;

现在,我将使用上述信息获取总消费更高的所有客户:

SELECT a.name
FROM orders o
JOIN accounts a
ON a.id = o.account_id
GROUP BY 1
HAVING SUM(o.total) > (SELECT total
FROM (SELECT a.name act_name, SUM(o.standard_qty) tot_std, SUM(o.total) total
FROM accounts a
JOIN orders o
ON o.account_id = a.id
GROUP BY 1
ORDER BY 2 DESC
LIMIT 1) sub);

上述查询列出了具有更多订单的客户列表。我们还可以使用另一个简单的子查询获取数量。

SELECT COUNT(*)
FROM (SELECT a.name
FROM orders o
JOIN accounts a
ON a.id = o.account_id
GROUP BY 1
HAVING SUM(o.total) > (SELECT total
FROM (SELECT a.name act_name, SUM(o.standard_qty) tot_std, SUM(o.total) total
FROM accounts a
JOIN orders o
ON o.account_id = a.id
GROUP BY 1
ORDER BY 2 DESC
LIMIT 1) inner_tab)
) counter_tab;
  1. 对于(在作为客户的整个时期内)总消费 ( total_amt_usd ) 最多的客户,他们在每个渠道上有多少 web_events


    我们首先需要获取在整个客户时期内消费最多的客户。
SELECT a.id, a.name, SUM(o.total_amt_usd) tot_spent
FROM orders o
JOIN accounts a
ON a.id = o.account_id
GROUP BY a.id, a.name
ORDER BY 3 DESC
LIMIT 1;

现在,我们要获取该企业(可以使用 id 进行匹配)在每个渠道上的事件数。

SELECT a.name, w.channel, COUNT(*)
FROM accounts a
JOIN web_events w
ON a.id = w.account_id AND a.id =  (SELECT id
FROM (SELECT a.id, a.name, SUM(o.total_amt_usd) tot_spent
FROM orders o
JOIN accounts a
ON a.id = o.account_id
GROUP BY a.id, a.name
ORDER BY 3 DESC
LIMIT 1) inner_table)
GROUP BY 1, 2
ORDER BY 3 DESC;

我添加了 ORDER BY ,其实并没特别的理由,并添加了客户名称,确保仅从一个客户那获取数据。

  1. 对于总消费前十名的客户,他们的平均终身消费 ( total_amt_usd ) 是多少?


    首先,我们需要找出总消费 ( total_amt_usd ) 在前十名的客户。
SELECT a.id, a.name, SUM(o.total_amt_usd) tot_spent
FROM orders o
JOIN accounts a
ON a.id = o.account_id
GROUP BY a.id, a.name
ORDER BY 3 DESC
LIMIT 10;

现在计算这十个客户的平均消费。

SELECT AVG(tot_spent)
FROM (SELECT a.id, a.name, SUM(o.total_amt_usd) tot_spent
FROM orders o
JOIN accounts a
ON a.id = o.account_id
GROUP BY a.id, a.name
ORDER BY 3 DESC
LIMIT 10) temp;


  1. 比所有客户的平均消费高的企业平均终身消费 ( total_amt_usd ) 是多少?


    首先,算出所有客户的总消费 ( total_amt_usd ) 平均值:
SELECT AVG(o.total_amt_usd) avg_all
FROM orders o
JOIN accounts a
ON a.id = o.account_id;

然后,只获取高于这一平均值的客户。

SELECT o.account_id, AVG(o.total_amt_usd)
FROM orders o
GROUP BY 1
HAVING AVG(o.total_amt_usd) > (SELECT AVG(o.total_amt_usd) avg_all
FROM orders o
JOIN accounts a
ON a.id = o.account_id);

最后,算出这些值的平均值。

SELECT AVG(avg_amt)
FROM (SELECT o.account_id, AVG(o.total_amt_usd) avg_amt
FROM orders o
GROUP BY 1
HAVING AVG(o.total_amt_usd) > (SELECT AVG(o.total_amt_usd) avg_all
FROM orders o
JOIN accounts a
ON a.id = o.account_id)) temp_table;